"I am borrowing ya mixing ?" An Analysis of English-Hindi Code Mixing in Facebook
Abstract
Code-Mixing is a frequently observed phenomenon in social media content generated by multilingual users. The processing of such data for linguistic analysis as well as computational modelling is challenging due to the linguistic complexity resulting from the nature of the mixing as well as the presence of non-standard variations in spellings and grammar, and transliteration. Our analysis shows the extent of Code-Mixing in English-Hindi data. The classification of Code-Mixed words based on frequency and linguistic typology underlines the fact that while there are easily identifiable cases of borrowing and mixing at the two ends, a large majority of the words form a continuum in the middle, emphasizing the need to handle these at different levels for automatic processing of the data.
Similar resources
Mainland Chinese Students’ Shifting Perceptions of Chinese-English Code-Mixing in Macao
As a former Portuguese colony, Macao is the only region in China where Cantonese, a variety of Chinese, and English, an international language, enjoy de facto official status, with Putonghua being a quasi-official language and Portuguese being another official language. Recently, with an increasing number of Mainland Chinese students crossing the border to pursue their tertiar...
POS Tagging of English-Hindi Code-Mixed Social Media Content
Code-mixing is frequently observed in user-generated content on social media, especially from multilingual users. The linguistic complexity of such content is compounded by the presence of spelling variations, transliteration and non-adherence to formal grammar. We describe our initial efforts to create a multi-level annotated corpus of Hindi-English code-mixed text collated from Facebook forums, an...
Dialogism amid Heteroglossia of the Translinguistic Process of Relexification: The Subversion of Colonial Cultural and Linguistic Imperialism
Most postcolonial African writers choose English as the language of their literary works to reach a wider audience, but they indigenize it in order to decolonize the colonial tool, i.e. the colonial language. The translinguistic process of relexification means subverting colonial cultural imperialism and colonial linguistic imposition through the dialogic interaction opened in the w...
The Effects of Oral Code-mixing and Glossing on Iranian EFL Learners' Vocabulary Knowledge
The current study investigated the effects of oral code-mixing and glossing on L2 vocabulary learning. To this end, 60 EFL learners studying at a pre-university school were given a pre-test to make sure that they did not have any prior knowledge of the target words. Based on their scores in the pre-test, 36 pre-university students were selected and divided into three groups, including two experim...
Sentiment Analysis of Code-Mixed Indian Languages: An Overview of SAIL_Code-Mixed Shared Task @ICON-2017
Sentiment analysis is essential in many real-world applications such as stance detection, review analysis, recommendation systems, and so on. Sentiment analysis becomes more difficult when the data is noisy and collected from social media. India is a multilingual country; people use more than one language to communicate among themselves. The switching in between the languages is called code-sw...
Publication date: 2014